Abstract-This paper reviews approaches to automating or semi-automating the seed purity
test. Computer vision (CV) is a sophisticated inspection technology used in a variety of
industries; however, it is not yet widely used in agriculture, where its application is very
challenging. Because CV can play an important role in this domain, research in the area has
been motivated. Several approaches to automating the seed purity test are briefly described.
The reviewed approaches are classified according to the features extracted from a particular
seed and the classifiers used to classify the seeds, and an overview of the most
representative methods for feature extraction and classification of seeds is presented.
The major goal of the paper is to provide a comprehensive reference source for
researchers involved in the automation of seed classification, regardless of the particular
feature or classifier.

Index terms- Computer vision, Image processing, feature extraction, classifiers. 

I. Introduction 

The application of computer vision technologies includes capturing, processing and analyzing images to
facilitate the objective and non-destructive assessment of visual quality characteristics in agricultural and
food products. Image analysis includes the techniques of image acquisition, image pre-processing and image
interpretation, resulting in the quantification and classification of images and of the objects of interest
within them. Grain kernels, as agricultural objects, vary in size, shape, color and texture. Moreover, these
features may vary from year to year, between growing regions within a year, and even over a single growing
season. Machine vision has the potential to be used for the detection of insect and mite infestation,
color classification and the identification of damaged kernels.

The basic and most important input for increasing agricultural production is seed. For certification,
seeds go through a process referred to as seed testing. In practice, the purity test is done by a seed
technologist, who views the lot on a video screen or under a magnifying lens to spot seeds of other
species and weeds and removes them with tweezers. Under normal conditions, a technician typically spends
eighty percent of his or her working time on this task for five months each year. Although all the
technicians of an organization receive similar training, the classification is not uniform, because it
depends on their ability and even on personal circumstances. Human involvement throughout the test
procedure also causes problems such as eye fatigue and variation in calls between inspectors. To overcome
these problems, more automation is needed.

At this stage, the machine vision system is used to decide which extracted features are relevant for
further processing. In agriculture, the objects from which features are commonly extracted include seeds,
fruits, flowers, vegetables and geographic features.



II. Organization of the article 



Automation of seed classification is based on classical methods that rely on feature extraction and
classifiers. This work attempts to cover both elements, since they are closely related, and is organized
as follows. Section III surveys the work done in the area. Section IV presents seeds and the features
that are extracted to discriminate a seed from the lot, and Section V addresses seeds and the
classifiers used to classify them. At the end there is a bibliography that the reader may use to explore
the field further. It is by no means exhaustive, but is intended to serve as a starting point and to direct
the reader to characteristic research in this area.



III. The survey of work done in the research area 

Substantial work on the seed purity test using image processing has been reported in seed technology.
Fig. 1 shows the development of seed technology at a glance. The work is categorized here by type of
seed.



A. Weed. Pablo M. Granitto et al. [1, 2] assessed the discriminating power of size, shape, color and texture
characteristics for the unique identification of 57 weed species using the Naive Bayes classifier. Size and
shape characteristics were found to have greater discriminating power than color and texture characteristics.
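
As an illustration of the kind of classifier referred to above, the following is a minimal sketch of a naive Bayes classifier with Gaussian class-conditional densities. The Gaussian form, the function names and the two-feature data are assumptions made for this example only; they are not taken from [1, 2].

```python
import math

def fit_gaussian_nb(samples, labels):
    """Estimate per-class priors, feature means and feature variances."""
    model = {}
    for cls in set(labels):
        rows = [x for x, y in zip(samples, labels) if y == cls]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small floor on the variance avoids division by zero.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, 1e-9)
            for col, m in zip(zip(*rows), means)
        ]
        model[cls] = (n / len(samples), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    def log_posterior(cls):
        prior, means, variances = model[cls]
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

# Hypothetical data: [square root of area, axis ratio] for two species.
train_x = [[2.0, 1.1], [2.2, 1.0], [2.1, 1.2], [4.0, 2.0], [4.2, 2.1], [3.9, 1.9]]
train_y = ["species_a"] * 3 + ["species_b"] * 3
nb_model = fit_gaussian_nb(train_x, train_y)
print(predict_gaussian_nb(nb_model, [2.05, 1.1]))
```

The "naive" assumption is that the features are independent within each class, which is why the log posterior is a simple sum over features.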

T. F. Burks et al. [3] used Colour Co-occurrence Method (CCM) texture analysis to evaluate
three different neural-network classifiers for potential use in real-time weed control systems. The
classification capabilities of the three neural-network models were compared, and the
Back Propagation Neural-Network (BPNN) classifier provided the best classification performance, with
96.7% accuracy.

Fig. 1: Seed technology development at a glance

B. Corn. Xiao Chena et al. [4] presented a method for classifying five corn varieties using image
processing techniques, stepwise discriminant analysis, Mahalanobis distance analysis and a BPNN. A
two-stage classifier that combined the Mahalanobis distance analysis and the BPNN was developed for
identification. The classification accuracies ranged from 88% to 100% across the varieties.

Cao Weishi et al. [5] presented a maize purity identification method based on the discrete wavelet
transform (DWT) and a BPNN. The identification accuracy was 94.5%.

Min Zhao et al. [6] proposed a real-time, accurate and objective method for identifying different varieties
of corn seeds. Color, texture and shape features were extracted, and a genetic algorithm and an SVM were
used to select features and determine the variety. The proposed method achieved a best classification rate
of 94.4%.

C. Wheat. Alireza Pourreza et al. [7] applied machine vision techniques to classify wheat seeds by
variety. Several groups of textural features of seed images were examined to evaluate their efficacy in
identifying nine common Iranian wheat seed varieties. A linear discriminant analysis (LDA) classifier was
employed for classification using the top selected features. The average classification accuracy was
98.15%.

Marian Wiwart et al. [8] presented a method to identify hybrids of spelt and wheat based on shape and color
descriptors using principal component analysis.

D. Areca nuts. Kuo-Yi Huang [9] presented an application of neural networks and image processing
techniques for detecting and classifying the quality of areca nuts. Defects caused by diseases or insects
were segmented by a detection line method. Six geometric features, three color features and the defect area
were used in the classification procedure, and a BPNN classifier was employed to sort the areca nuts by
quality. The classification accuracy was 90.9%.

E. Cotton. Li Jingbin et al. [10] proposed a nonlinear identification method based on a BPNN and
investigated three varieties of delinted cottonseeds. Color and shape characteristic parameters were
selected. The BPNN identification method had higher accuracy than the step discrimination
method, with a test accuracy of 90%.

Jamuna et al. [11] employed a machine learning approach to classify the quality of seeds based on the
different growth stages of the cotton crop. Decision Tree, Naive Bayes and Multilayer Perceptron
classifiers were trained. The results showed that the Decision Tree classifier and the Multilayer
Perceptron gave the same accuracy, but the Multilayer Perceptron took longer to build the model than the
Decision Tree classifier.

F. Rice. Liu Zhao-yan et al. [12] developed a digital image analysis algorithm based on color and
morphological features to identify six varieties of paddy.

G. Bulk grains. Kantip Kiratiratanapruk [13] proposed a method to classify more than ten categories of seed
defects using color and texture features and a support vector machine (SVM) classifier.

Adjemout Ouiza et al. [14] emphasized the pattern recognition aspects and considered four hundred samples
of each of four species of seeds, namely corn, oat, barley and lentil. Recognition was performed on the
basis of shape features and texture features separately. The feature space was reduced using principal
component analysis, and clustering was based on the k-means algorithm.

H. Rubber. Hadzli Hashim et al. [15] developed an intelligent model for classifying selected rubber tree
series clones based on shape features obtained with image processing techniques. Shape features were
extracted from each image, and two models were designed. The optimized Model 2 had the best
accuracy, 84%, with sensitivity and specificity both above 70%.
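
For readers unfamiliar with the three performance indicators quoted above, the sketch below shows how accuracy, sensitivity and specificity are computed from the counts of a two-class confusion matrix. The counts are hypothetical values chosen only to illustrate the arithmetic; they are not results from [15].

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from two-class confusion counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # share of all samples classified correctly
        "sensitivity": tp / (tp + fn),      # share of positive samples found
        "specificity": tn / (tn + fp),      # share of negative samples rejected
    }

# Hypothetical counts (not taken from the reviewed paper).
metrics = binary_metrics(tp=45, fn=5, tn=39, fp=11)
print(metrics)
```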

IV. Seeds and features

Different features need to be extracted for proper classification of seeds. Each type of seed has its own
features, which are extracted and passed on for further processing. Substantial work on seed feature
extraction using image processing has been reported. It is categorized here by type of feature.

A. Color. Pablo M. Granitto et al. [1] extracted color features of weed seeds. Gray level histograms in the
$I$, $r$ and $g$ channels were calculated. From these histograms, standard features such as the average,
variance and skewness were measured. Three ratios of average values in the RGB channels, $E[R]/E[G]$,
$E[R]/E[B]$ and $E[G]/E[B]$, were also calculated (where $E[\cdot]$ denotes the average pixel value in the
corresponding channel). In total, 12 different color characteristics were measured. The final four
parameters selected for classification were the variance of the intensity histogram, $M_2(I)$; the
skewness of the intensity histogram, $M_3(I)/M_2(I)^{3/2}$; and two ratios of average pixel values in the
RGB channels, $E[R]/E[B]$ and $E[G]/E[B]$.

Xiao Chena et al. [4] extracted a total of 58 features for identifying corn varieties, comprising 30
morphological features and 28 color features. Color features have been widely used to classify grain
varieties but, unlike most grains, a corn kernel is not uniform in color: both the germ and the tip cap are
typically white, whether the whole kernel is white or yellow. To study the effect of color features on
the identification of corn varieties, four transformations of the RGB color space were evaluated, namely
rgb (normalized RGB values), YCbCr, I1I2I3 and HSV, and 28 features, such as the mean and standard
deviation of these color components, were calculated. From these, 18 color features (the mean and standard
deviation of B, g, G, R, S, H, I3, I2 and r) were selected by stepwise selection for recognizing the three
types of corn kernel (white, yellow and mixed), and finally six color features (including the standard
deviations of g, r and I1 and the means of r and g) were selected by stepwise discrimination for
recognizing the three varieties of yellow corn kernels.

Cao Weishi et al. [5] obtained the RGB color parameters of the crown part of the maize seed and
processed the three color components with a two-level DWT. The feature value was the average of the color
component pixels in the crown core area of each maize seed. After the two-level DWT of the three color
components of the same image, the average of every band was calculated separately, giving 21
characteristic parameters: V = {BLL2, BHL1, BLH1, BHH1, BHL2, BLH2, BHH2, GLL2, GHL1, GLH1, GHH1, GHL2,
GLH2, GHH2, RLL2, RHL1, RLH1, RHH1, RHL2, RLH2, RHH2}.
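
The 21 band averages described for [5] can be illustrated with a small sketch of a two-level DWT. The Haar wavelet, the HL/LH naming convention and the constant 4x4 test patches are assumptions made for this example; the wavelet actually used in [5] is not stated here. The image side lengths are assumed to be divisible by 4.

```python
def haar_dwt2(img):
    """One level of a 2-D Haar transform; returns the sub-bands (LL, HL, LH, HH)."""
    ll, hl, lh, hh = [], [], [], []
    for r in range(0, len(img), 2):
        rows = ([], [], [], [])
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 2)  # low-pass in both directions
            rows[1].append((a - b + d - e) / 2)  # difference between columns
            rows[2].append((a + b - d - e) / 2)  # difference between rows
            rows[3].append((a - b - d + e) / 2)  # diagonal difference
        for band, row in zip((ll, hl, lh, hh), rows):
            band.append(row)
    return ll, hl, lh, hh

def band_mean(band):
    values = [v for row in band for v in row]
    return sum(values) / len(values)

def dwt_features(channel):
    """Seven band means from a two-level Haar DWT of one color channel."""
    ll1, hl1, lh1, hh1 = haar_dwt2(channel)
    ll2, hl2, lh2, hh2 = haar_dwt2(ll1)  # second level is applied to LL1 only
    bands = {"LL2": ll2, "HL1": hl1, "LH1": lh1, "HH1": hh1,
             "HL2": hl2, "LH2": lh2, "HH2": hh2}
    return {name: band_mean(band) for name, band in bands.items()}

# Hypothetical 4x4 crown-region patches, one per color channel.
patch = {
    "B": [[10] * 4 for _ in range(4)],
    "G": [[20] * 4 for _ in range(4)],
    "R": [[30] * 4 for _ in range(4)],
}
feature_vector = {
    ch + name: value
    for ch, channel in patch.items()
    for name, value in dwt_features(channel).items()
}
print(len(feature_vector))
```

Seven bands per channel over three channels gives the 21 parameters of the vector V; in [5] these averages are the inputs of the BP neural network.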

Min Zhao et al. [6] obtained twelve color features of corn seeds from the acquired images: the mean and
standard deviation of red, green and blue, and the mean and standard deviation of hue, saturation and
intensity.

Marian Wiwart et al. [8] performed the color analysis on the average R, G and B values of every ROI,
which were then used to calculate the values of H, S, I and Lab.

Kuo-Yi Huang [9] calculated a pair of orthogonal eigenvectors of the covariance matrix. The color features
Rm, Gm and Bm (i.e., the mean gray level of the areca nut in the R, G and B bands) of the entire areca nut
were computed using the eigenvectors.

Li Jingbin et al. [10] used the RGB and HSI color models. Twelve color characteristic parameters were
defined for cotton seeds: the mean and standard deviation of R (red), G (green), B (blue), H (hue),
S (saturation) and I (luminance).
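
A minimal sketch of how twelve such color parameters might be computed from the pixels of a segmented seed is given below. The RGB-to-HSI conversion is the common textbook form and is an assumption, as are the function names and the two sample pixels; the reviewed papers may use a different conversion. Pixel components are assumed to be scaled to the range 0 to 1.

```python
import math
from statistics import mean, pstdev

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3
    saturation = 0.0 if intensity == 0 else 1 - min(r, g, b) / intensity
    denom = math.sqrt(max((r - g) ** 2 + (r - b) * (g - b), 0.0))
    if denom == 0:
        hue = 0.0  # hue is undefined for gray pixels; 0 is used as a placeholder
    else:
        cos_theta = max(-1.0, min(1.0, 0.5 * ((r - g) + (r - b)) / denom))
        theta = math.degrees(math.acos(cos_theta))
        hue = theta if b <= g else 360 - theta
    return hue, saturation, intensity

def color_features(pixels):
    """Mean and standard deviation of R, G, B, H, S and I over a list of RGB pixels."""
    hsi = [rgb_to_hsi(*p) for p in pixels]
    channels = {
        "R": [p[0] for p in pixels],
        "G": [p[1] for p in pixels],
        "B": [p[2] for p in pixels],
        "H": [v[0] for v in hsi],
        "S": [v[1] for v in hsi],
        "I": [v[2] for v in hsi],
    }
    features = {}
    for name, values in channels.items():
        features["mean_" + name] = mean(values)
        features["std_" + name] = pstdev(values)
    return features

# Two hypothetical seed pixels.
seed_pixels = [(0.8, 0.6, 0.4), (0.6, 0.4, 0.2)]
features = color_features(seed_pixels)
print(sorted(features))
```

Note that hue is an angle, so a plain arithmetic mean is only a rough summary when hues lie close to the 0/360 degree wrap-around.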

Kantip Kiratiratanapruk [13] adopted color histograms in the RGB and HSV color spaces, with eight bins in
each color channel, for corn identification.

B. Morphology. Pablo M. Granitto et al. [1] extracted morphological features of weed seeds. Size and shape
characteristics were obtained from binarized images. The lengths of the principal axes, several moments of
the planar mass distribution with respect to those axes, the ratio of its area to the seed area
(compactness) and the size of the minimal rectangular box containing the seed were measured. In total, 21
morphological features were measured. The final six parameters selected for classification were the ratio
of the semi-axis lengths of the main principal axis, $h_1/h_2$; the ratio of the seed area to the enclosing
box area, $A/((h_1+h_2)\times(v_1+v_2))$; the square root of the seed area, $\sqrt{A}$; and three moments
of the planar mass distribution with respect to the principal axes, $M_{20}$, $M_{21}$ and $M_{22}$.

It was concluded that morphological features have the greatest discriminating power, that color and texture
are less reliable, and that morphology plus color features have an edge over the combined use of
morphology and texture.

Xiao Chena et al. [4] extracted a total of 58 features for identifying corn varieties, including 30
morphological features.

Min Zhao et al. [6] extracted 11 geometric features of corn kernels from the binary image, including
contour points, perimeter, area, circular degree, equivalent diameter, major length, minor length,
stretching length of the rectangle, maximum inscribed circle and smallest circumscribed circle.

Marian Wiwart et al. [8] determined the following descriptors for the image of each wheat kernel,
represented by a single blob (ROI, region of interest): Area, Perimeter, Circularity, Feret Diameter,
Minimal Feret Diameter, Aspect Ratio, Roundness and Solidity.

Kuo-Yi Huang [9] calculated a pair of orthogonal eigenvectors of the covariance matrix. The geometric
features of areca nuts, namely the principal axis length (Lp), secondary axis length (Ls), centroid, axis
number (Lp/Ls), area (A), perimeter (P) and compactness ($4\pi A/P^2$), were computed using the eigenvectors.
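
The geometric features listed above can be sketched for a binary seed mask as follows. This is an illustrative sketch only: the perimeter is approximated by counting exposed pixel edges, and the axis ratio is taken as the square root of the ratio of the covariance eigenvalues, both of which are assumptions rather than the procedure of [9].

```python
import math

def shape_features(mask):
    """Basic shape descriptors of a binary mask given as rows of 0/1 values."""
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    area = len(pixels)
    cx = sum(x for x, _ in pixels) / area
    cy = sum(y for _, y in pixels) / area

    # Perimeter as the number of pixel edges that border the background.
    filled = set(pixels)
    perimeter = sum(
        (x + dx, y + dy) not in filled
        for x, y in pixels
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
    )

    # Covariance matrix of the pixel coordinates and its two eigenvalues.
    sxx = sum((x - cx) ** 2 for x, _ in pixels) / area
    syy = sum((y - cy) ** 2 for _, y in pixels) / area
    sxy = sum((x - cx) * (y - cy) for x, y in pixels) / area
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam_major, lam_minor = half_trace + root, half_trace - root

    return {
        "area": area,
        "perimeter": perimeter,
        "centroid": (cx, cy),
        "axis_ratio": math.sqrt(lam_major / lam_minor) if lam_minor > 0 else float("inf"),
        "compactness": 4 * math.pi * area / perimeter ** 2,
    }

# Hypothetical mask: a 4x2 block of seed pixels on a background of zeros.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
features = shape_features(mask)
print(features)
```

Compactness equals 1 for an ideal circle and decreases for elongated or irregular outlines, which is why it is useful for separating round seeds from stretched ones.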

Li Jingbin et al. [10] extracted fourteen shape characteristic parameters of cottonseeds: area, perimeter,
NCI ratio, circular degree, center of gravity X, center of gravity Y, major diameter, short diameter,
second moment X (Mx2), second moment Y (My2), second moment XY (Mxy), major axis of oval, short axis
of oval and shape coefficient of oval.

Adjemout Ouiza et al. [14] performed recognition on the basis of shape features separately for corn,
oats, barley and lentil. Fifteen shape features, namely the perimeter, the surface, the circularity, the
major axis, the minor axis, Hu's moments and the central moments of second order, were calculated from the
pre-processed images.

Hadzli Hashim et al. [15] proposed two models for automated classification of rubber seed clones. The first
model (Model 1) uses 38 shape features (area, perimeter and 36 radius components) as inputs,
while the second (Model 2) uses only the reduced set of inputs obtained after applying PCA.

C. Texture. Pablo M. Granitto et al. [1] extracted features corresponding to the textural characteristics
of weed seeds. Two matrices, the gray level co-occurrence matrix and the gray level run length matrix, were
used to describe seed surface texture. The final two parameters selected for classification were the
contrast along the main principal axis direction and the cluster prominence along the secondary principal
axis direction.

Min Zhao et al. [6] obtained texture features such as the mean, variance, smoothness, third moment,
consistency, entropy and 7 statistical invariant moments from the gray image.

Alireza Pourreza et al. [7] extracted 131 textural features, including 32 gray level textural features (mean,
standard deviation, smoothness, third moment, uniformity, entropy, gray level range and 25 histogram 
groups), 31 LBP features (mean, standard deviation, smoothness, third moment, uniformity, entropy and 25 
histogram groups), 31 LSP features (mean, standard deviation, smoothness, third moment, uniformity, 
entropy and 25 histogram groups), 15 LSN features (mean, standard deviation, smoothness, third moment, 
uniformity, entropy and histogram of LSN matrix containing nine features), 10 gray level co-occurrence 
matrix (GLCM) features (mean, variance, entropy, uniformity, homogeneity, inertia, cluster shade, cluster 
prominence, maximum probability and correlation) and 12 gray level run-length matrix (GLRM) features
(short run, long run, gray level non-uniformity, run ratio, run length non-uniformity, entropy, low gray level 
run, high gray level run, short run low gray level, short run high gray level, long run low gray level and long 
run high gray level) for each monochrome image of the bulk wheat samples. 

Kantip Kiratiratanapruk [13] adopted texture features such as energy, contrast, correlation and homogeneity
based on the gray level co-occurrence matrix (GLCM) and the local binary pattern (LBP) for corn image
classification.
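
The four GLCM descriptors named above can be sketched as follows for a small image with a few gray levels. The horizontal one-pixel offset, the symmetric counting and the definition of energy as the sum of squared matrix entries are assumptions made for this example (some libraries define energy as the square root of that sum); the settings used in [13] are not stated here.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric gray level co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < cols and 0 <= y2 < rows:
                i, j = image[y][x], image[y2][x2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]

def glcm_features(p):
    """Energy, contrast, homogeneity and correlation of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    return {
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "correlation": cov / (var_i * var_j) ** 0.5 if var_i > 0 and var_j > 0 else 1.0,
    }

# Hypothetical 4x4 image quantized to four gray levels.
gray = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
texture = glcm_features(glcm(gray, levels=4))
print(texture)
```

In practice the matrix is usually computed for several offsets and directions and the resulting descriptors are averaged or concatenated.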

Adjemout Ouiza et al. [14] performed recognition on the basis of texture features separately for corn,
oats, barley and lentil. The spatial gray-level dependence method was used to extract texture
features such as the second angular moment (SAM), which gives information about the homogeneity of the
texture; contrast (CONT), which measures the local variation of the texture and favors large transitions
between gray levels; entropy (ENT), which evaluates the degree of organization of the pixels; variance
(VAR); differential inverse moment (EVl); and correlation (COR).

Fig. 2 shows a graphical presentation of the seeds and the features extracted from them.

Fig. 2: Seed and Features

V. Seeds and classifiers

The extracted features are given as input to the classifiers for classification of the seeds. Substantial
work on seed classification using pattern recognition has been reported. It is categorized here by type of
seed.

Fig. 3 shows a graphical presentation of the seeds and the classifiers used to classify them.

A. Weed. Pablo M. Granitto et al. [1, 2] compared the naive Bayes classifier, a single ANN and a committee
of ten networks (with two combination options: majority rule and added probabilities). The two
ANN committee implementations were found to perform better than the naive Bayes and single ANN classifiers.
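
The two ways of combining committee members mentioned above can be sketched as follows. The three member outputs are hypothetical and the function names are chosen for this example; the sketch shows only the combination rules, not the networks themselves.

```python
from collections import Counter

def majority_vote(member_probs):
    """Each member votes for its most probable class; the most common vote wins."""
    votes = [max(probs, key=probs.get) for probs in member_probs]
    return Counter(votes).most_common(1)[0][0]

def added_probabilities(member_probs):
    """Sum the class probabilities over all members and pick the largest total."""
    totals = Counter()
    for probs in member_probs:
        totals.update(probs)  # adds the values class by class
    return max(totals, key=totals.get)

# Hypothetical outputs of three committee members for one seed image.
outputs = [
    {"species_a": 0.55, "species_b": 0.45},
    {"species_a": 0.60, "species_b": 0.40},
    {"species_a": 0.05, "species_b": 0.95},
]
print(majority_vote(outputs), added_probabilities(outputs))
```

The example is chosen so that the two rules disagree: two members weakly prefer one class, while one member strongly prefers the other, so adding probabilities lets the confident member outweigh the majority.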

T. F. Burks et al. [3] compared the classification capabilities of three neural-network
models (backpropagation, counterpropagation and radial basis function). The
backpropagation neural-network classifier provided the best classification performance, achieving
classification accuracies of 97%, which exceeded the 93% accuracy of the traditional statistical
classification procedure. Among the three neural-network methodologies, the backpropagation method not only
achieved higher classification accuracy but also had lower computational requirements.

B. Corn. Xiao Chena et al. [4] developed a two-stage classifier for identification that combined the
Mahalanobis distance and a BPNN classifier. Experiments showed that the average classification accuracy for
five corn varieties was up to 90%. It was concluded that the method combining the Mahalanobis distance and
the BPNN classifier may be successfully employed for corn variety identification.
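
A minimum Mahalanobis distance rule of the kind that could serve as the first stage of such a classifier is sketched below for two features. The group names, the feature choice and the sample values are hypothetical, and the second (BPNN) stage is not shown; this is not the implementation of [4].

```python
def class_statistics(rows):
    """Mean vector and inverse covariance matrix for two-feature samples."""
    n = len(rows)
    mx = sum(r[0] for r in rows) / n
    my = sum(r[1] for r in rows) / n
    sxx = sum((r[0] - mx) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - my) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - mx) * (r[1] - my) for r in rows) / (n - 1)
    det = sxx * syy - sxy ** 2  # assumed non-zero for the training data
    inverse = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inverse

def mahalanobis_sq(x, stats):
    """Squared Mahalanobis distance of x from the class described by stats."""
    (mx, my), inv = stats
    dx, dy = x[0] - mx, x[1] - my
    return dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)

def first_stage(x, groups):
    """Assign x to the group with the smallest Mahalanobis distance."""
    return min(groups, key=lambda name: mahalanobis_sq(x, groups[name]))

# Hypothetical [mean hue, mean saturation] samples for two kernel types.
training = {
    "white": [(0.10, 0.20), (0.12, 0.25), (0.09, 0.22), (0.11, 0.18)],
    "yellow": [(0.15, 0.70), (0.17, 0.75), (0.14, 0.72), (0.16, 0.68)],
}
groups = {name: class_statistics(rows) for name, rows in training.items()}
print(first_stage((0.11, 0.21), groups))
```

Unlike the Euclidean distance, the Mahalanobis distance scales each direction by the spread of the class, so features with large variance do not dominate the decision.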

Cao Weishi et al. [5] used the average of every band as the input samples of a BP neural network and
the purity identification results of the maize seed as the output samples. The results demonstrated that
this method can identify maize purity effectively, with an identification rate of 94.5%.

Min Zhao et al. [6] applied an SVM for classification using an optimal combination of features. The
algorithm was able to correctly classify the three varieties of corn. Of 50 corn seed images, 20 were used
as training samples and 30 as test samples. The SVM classifier gave a classification rate of 94.4%. The
average processing time per seed was 0.141 s.

Kantip Kiratiratanapruk [13] adopted a support vector machine (SVM) for seed classification. Color and
texture features were provided to the SVM for training and for identification of the unknown seed type.
The technique was evaluated on 14,000 seed sample images covering a normal seed type and 13 defect seed
types. The accuracies obtained were 76% and 56% when each feature type was used separately, and 81.8% when
color and texture were combined. When errors occurred, defect seed types were usually misclassified as
other defect seed types. Only 0.64% of defect seeds, on average, were misclassified as normal, whereas
normal seeds were misidentified as defect seeds at a higher rate of 4.44%.

C. Areca nuts. Kuo-Yi Huang [9] used a back propagation neural network (BPNN) to classify areca nuts into
Excellent, Good and Bad classes. The BPNN classifier consisted of three layers: an input layer, a hidden
layer and an output layer. The input layer had 10 nodes, corresponding to the SR area, the three color
features and the six geometric features mentioned above. The output layer had nodes corresponding to the
three categories: Excellent, Good and Bad. To establish the BPNN classifier, 144 samples (49 Excellent,
46 Good and 49 Bad) were randomly sampled from 287 images using a 50%-50% split. Eighteen hidden nodes were
obtained from Eq. (1), using 10 input features, 3 output categories and 144 input samples.

$$n_h = \left[\frac{n_i + n_o}{2}\right] + (n_p)^{0.5} \qquad (1)$$

where $n_i$ is the number of input nodes, $n_o$ is the number of output nodes, and $n_p$ is the number of
input patterns in the training set. The classification accuracies were 91.7%, 89.1% and 91.8% for the
Excellent, Good and Bad grades, respectively. The average accuracy was 90.9%. The total numbers of correct
and erroneous classifications were 130 and 13, respectively.
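
The figures quoted for [9] can be checked with a few lines of arithmetic. The sketch assumes that the square brackets in Eq. (1) denote the integer part, which is an interpretation rather than something stated in the source; with 10 inputs, 3 outputs and 144 training patterns it reproduces the eighteen hidden nodes, and 130 correct out of 143 test classifications reproduces the 90.9% average accuracy.

```python
import math

def hidden_nodes(n_inputs, n_outputs, n_patterns):
    """Hidden-layer size from Eq. (1), reading the square brackets as the integer part."""
    return math.floor((n_inputs + n_outputs) // 2 + math.sqrt(n_patterns))

n_hidden = hidden_nodes(n_inputs=10, n_outputs=3, n_patterns=144)

# Overall accuracy from the reported totals of correct and erroneous classifications.
correct, wrong = 130, 13
accuracy_percent = round(100 * correct / (correct + wrong), 1)

print(n_hidden, accuracy_percent)
```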

D. Wheat. Alireza Pourreza et al. [7] employed a linear discriminant analysis (LDA) classifier to classify
wheat seed samples into nine classes, one per variety, using the top selected features. The LDA classifier
gave a maximum average accuracy of 98.15% (ranging from 88.33% to 100%).

Marian Wiwart et al. [8] adopted principal component analysis (PCA) for discrimination. The grain images
of three common wheat varieties, five spelt breeding lines and 24 single hybrids between wheat and spelt
were subjected to PCA three times: for shape descriptors only, for color attributes only, and for all
analyzed variables. In the PCA of shape descriptors only, the percentage of variation explained by the
first two PCs reached 98.98%, whereas it was 90.27% for color descriptors. The PCA of variables describing
the shape and color of grain images supported reliable discrimination of the hybrids and their parental
forms.

E. Cotton. Li Jingbin et al. [10] used a BP neural network trained on the training set. The network
consisted of three layers. The input layer had 9 nodes. Three cottonseed varieties were tested
at the same time, so the output layer had 3 nodes. The hidden layer had 11 nodes. The BP neural network
identification method had higher accuracy than the step discrimination analysis method.

Jamuna KS et al. [11] showed that the multilayer perceptron and decision tree classifiers predict better
than the Naive Bayes algorithm. Among the three classifiers used in the experiment, the decision tree
induction algorithm (J48) and the multilayer perceptron gave the same prediction accuracy, while the
accuracy of the Naive Bayes classifier was lower. The multilayer perceptron, a neural network classifier,
took the most time to build the model, whereas Naive Bayes, a probabilistic classifier, learned more
rapidly on the given dataset. There was little statistical difference between the time taken to build the
decision tree model and the probabilistic model. J48 can therefore be employed in the agriculture domain to
predict the quality of cotton seed.

F. Oats, Barley, Lentil. Adjemout Ouiza et al. [14] carried out recognition by assigning an
unknown seed to a class on the basis of the Euclidean distance between the feature vector
of the unknown seed and the average feature vector of each cluster. The unknown seed was assigned to the
cluster with the smallest distance. After testing more than four hundred seeds of each class, the
average recognition rate was 85.75%. Because of their circular shape, lentils were recognized at
100%. With its compact shape, corn reached a recognition rate of 99%. For oats and barley, the
recognition rates were 97% and 47%, respectively. Some barley seeds were confused with oats and,
conversely, some oat seeds were confused with barley, owing to their similar sizes and
elongated shapes.
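
The minimum-distance rule described above can be sketched as follows. The two-feature vectors and cluster members are hypothetical and serve only to show the mechanics; the last lines check that the four per-species rates quoted in the text average to the reported 85.75%.

```python
import math

def centroid(vectors):
    """Average feature vector of a cluster."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def nearest_cluster(x, centroids):
    """Assign x to the cluster whose average feature vector is closest."""
    return min(centroids, key=lambda name: euclidean(x, centroids[name]))

# Hypothetical [circularity, elongation] vectors for three clusters.
clusters = {
    "lentil": [[0.95, 1.0], [0.97, 1.1]],
    "oat": [[0.40, 3.0], [0.42, 3.2]],
    "barley": [[0.45, 2.8], [0.47, 2.6]],
}
centroids = {name: centroid(members) for name, members in clusters.items()}
print(nearest_cluster([0.93, 1.05], centroids))

# Recognition rates quoted in the text (lentil, corn, oats, barley), in percent.
rates = [100, 99, 97, 47]
average_rate = sum(rates) / len(rates)
print(average_rate)
```

The closeness of the hypothetical oat and barley centroids mirrors the confusion between these two species reported in the text.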

G. Rubber. Hadzli Hashim et al. [15] used a multi-layer perceptron artificial neural network (ANN) trained
with the Levenberg-Marquardt algorithm. The network is given a set of training data containing
examples of inputs together with the corresponding outputs and learns to infer the relationship between
the two. Two models were proposed for automated classification of rubber seed clones. The ratio between the
best clone (RRIM2009) and the non-best clones (RRIM2005 and RRIM2016) in the training set was fixed at
110:40:40. After training, the network was tested on a test set of 100 samples in which the ratio between
the clones was 50:25:25. The optimized model was then evaluated and validated through analysis of
performance indicators.



Fig. 3: Seed and Classifiers



VI. The kinds of conclusions expected and their possible value 

The use of good quality seed increases crop yield, decreases the number of seeds that need to be sown
and reduces the carryover of weeds, insects and diseases. However, when seed quality tests are done
manually, the results are adversely affected and the man-hours spent on the test increase. This research
would therefore help to make the seed test more accurate, and thereby improve the quality of various crop
seeds, by identifying relevant, robust and invariant features for reliably discriminating the seed of
interest and by designing efficient and accurate classifiers. It can be concluded from Fig. 2 that color
and shape have the most discriminating power, although the features extracted vary with the type of seed
to be classified, and from Fig. 3 that neural networks are widely used for classification.